Explainability methods for NNs
Resources
- Using ML to Explore Neural Network Architecture
- The Building Blocks of Interpretability
- Feature Visualization
- Applying deep learning to real-world problems (labeled data, imbalance, black box models)
- Unblackboxing webinar (deepsense.io)
- The Dark Secret at the Heart of AI
- How AI detectives are cracking open the black box of deep learning
- Visualization of activations and filters
- https://towardsdatascience.com/understanding-your-convolution-network-with-visualizations-a4883441533b
- https://imatge.upc.edu/web/publications/visual-saliency-prediction-using-deep-learning-techniques
- Attributing a deep network’s prediction to its input features
- Integrated gradients method
- It involves only a few calls to a gradient operator and yields insightful attributions for a variety of deep networks; a minimal sketch follows below
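- A minimal sketch of the integrated gradients approximation (a Riemann sum over the straight-line path from a baseline to the input); the model, input shape and target class index are illustrative assumptions, not tied to any specific library:

```python
import tensorflow as tf

def integrated_gradients(model, x, target_class, baseline=None, steps=50):
    """IG(x) = (x - x') * integral of the gradients along the straight-line
    path from the baseline x' to the input x (Riemann-sum approximation)."""
    x = tf.convert_to_tensor(x, tf.float32)            # single image, (H, W, C)
    if baseline is None:
        baseline = tf.zeros_like(x)                    # common choice: black image
    alphas = tf.reshape(tf.linspace(0.0, 1.0, steps + 1), (-1, 1, 1, 1))
    interpolated = baseline + alphas * (x - baseline)  # (steps+1, H, W, C)
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = model(interpolated)[:, target_class]
    grads = tape.gradient(scores, interpolated)
    # Trapezoidal approximation of the path integral
    avg_grads = tf.reduce_mean((grads[:-1] + grads[1:]) / 2.0, axis=0)
    return (x - baseline) * avg_grads
```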
Saliency maps
- https://en.wikipedia.org/wiki/Saliency_map
- Saliency map is a broader term from the field of computer vision. The first use of saliency maps for the predictions of DNNs is Mørch et al. (1995). Simonyan et al. (2014) first proposed a method to produce saliency maps by back-propagating through a CNN, but note that "saliency" can be computed from an image in many ways that do not involve back-propagating the prediction scores of a DNN.
- Pixel Attribution (Saliency Maps)
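- A minimal sketch of the vanilla gradient saliency map of Simonyan et al. (2014), assuming a Keras classifier `model` that outputs class scores (names are illustrative):

```python
import tensorflow as tf

def vanilla_saliency(model, x, target_class):
    """Gradient of the class score w.r.t. the input pixels; the per-pixel
    saliency is the maximum absolute gradient across colour channels."""
    x = tf.convert_to_tensor(x[None, ...], tf.float32)  # add batch dimension
    with tf.GradientTape() as tape:
        tape.watch(x)
        score = model(x)[0, target_class]
    grads = tape.gradient(score, x)[0]
    return tf.reduce_max(tf.abs(grads), axis=-1)
```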
Layer-wise Relevance Propagation (LRP)
- LRP is a backward-propagation method that calculates the contribution of each individual pixel to the prediction made by a DNN in an image classification task
- http://heatmapping.org/
- Interactive demo
- https://medium.com/@ODSC/layer-wise-relevance-propagation-means-more-interpretable-deep-learning-219ff5158914
- https://towardsdatascience.com/indepth-layer-wise-relevance-propagation-340f95deb1ea
- There are several approaches for calculating attributions by back-propagating the prediction score through each layer of the network, back to the input features/pixels (DeConvNet, SmoothGrad, Grad-CAM, LRP, XRAI); LRP is just one of them. The first LRP paper speaks of heatmaps or relevance maps, probably to avoid confusion with older saliency map techniques. A minimal sketch of one LRP rule follows below.
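- A minimal sketch of the LRP-epsilon rule for a single dense layer; layer shapes and names are illustrative assumptions, and a full implementation must walk every layer of the network:

```python
import numpy as np

def lrp_epsilon(weights, activations, relevance_out, eps=1e-6):
    """One LRP-epsilon step for a dense layer: redistribute the output
    relevances onto the inputs in proportion to each input's contribution.
    weights: (n_in, n_out), activations: (n_in,), relevance_out: (n_out,)"""
    z = activations @ weights                    # pre-activations, (n_out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)    # stabiliser against z ~ 0
    s = relevance_out / z                        # (n_out,)
    return activations * (weights @ s)           # input relevances, (n_in,)
```

- Applied layer by layer from the output score back to the input, the returned relevances approximately conserve the prediction score, which is what makes the resulting pixel map a relevance heatmap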
Courses
- #COURSE Interpretable Machine Learning for Computer Vision (CVPR 2020)
- #COURSE Explainability Methods for Neural Networks (2021)
Code
- #CODE Xplique
- Python toolkit dedicated to explainability, currently based on TensorFlow
- https://deel-ai.github.io/xplique/
- #PAPER Xplique: A Deep Learning Explainability Toolbox (Fel 2022)
- #CODE Quantus
- Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations
- #CODE TruLens (tf.keras and PyTorch): Explainability for Neural Networks
- #CODE Captum (PyTorch)
- Interpretability of models across modalities including vision, text, and more
- https://captum.ai/
- https://captum.ai/api/
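- A minimal Captum usage sketch for integrated gradients; the toy model, input shape and target class index are stand-ins for a real trained classifier:

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Stand-in classifier; substitute any trained PyTorch model
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
ig = IntegratedGradients(model)

x = torch.randn(1, 3, 32, 32)      # stand-in input image
baseline = torch.zeros_like(x)     # all-black reference
# Attribute the class-3 score back to the input pixels
attributions, delta = ig.attribute(x, baselines=baseline, target=3,
                                   return_convergence_delta=True)
```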
- #CODE Explainable-cnn
- PyTorch-based visualization package for generating layer-wise explanations for CNNs
- #CODE Saliency
- XRAI, SmoothGrad, Vanilla Gradients, Guided Backpropagation, Integrated Gradients, Occlusion, Grad-CAM, Blur IG
- #CODE iNNvestigate
- Vanilla gradient, SmoothGrad, DeConvNet, Guided BackProp, PatternNet, DeepTaylor, PatternAttribution, LRP, IntegratedGradients, DeepLIFT
- #CODE TF-explain
- Implements interpretability methods as TensorFlow 2.x callbacks to ease the understanding of neural networks
- #CODE TensorSpace (TensorFlow.js)
- Neural network 3D visualization framework
- https://tensorspace.org
- #CODE Lucid (TensorFlow 1) - A collection of infrastructure and tools for research in neural network interpretability
- #CODE tf-keras-vis
- Neural network visualization toolkit for tf.keras
- Activation Maximization
- Class Activation Maps (GradCAM, GradCAM++, ScoreCAM, Faster-ScoreCAM)
- Saliency Maps (Vanilla Saliency, SmoothGrad)
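- Independently of any particular toolkit, the Grad-CAM recipe behind the class activation maps listed above can be sketched as follows, assuming a Keras functional-API CNN and the name of its last convolutional layer (all names illustrative):

```python
import tensorflow as tf

def grad_cam(model, x, target_class, conv_layer_name):
    """Grad-CAM: weight the last conv layer's feature maps by the spatially
    averaged gradients of the class score, sum them and apply a ReLU."""
    conv_layer = model.get_layer(conv_layer_name)
    grad_model = tf.keras.Model(model.inputs, [conv_layer.output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(x[None, ...])   # add batch dimension
        score = preds[:, target_class]
    grads = tape.gradient(score, conv_out)           # (1, h, w, n_filters)
    weights = tf.reduce_mean(grads, axis=(1, 2))     # pooled gradients, (1, n_filters)
    cam = tf.einsum('bhwc,bc->bhw', conv_out, weights)[0]
    return tf.nn.relu(cam)  # low-resolution map; upsample to overlay on the input
```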
- #CODE Keras-vis
- https://raghakot.github.io/keras-vis/
- Activation maximization, Saliency maps, Class activation maps
- #CODE DeepExplain (TensorFlow 1)
- Saliency maps, Gradient * Input, Integrated Gradients, DeepLIFT, ε-LRP
- #CODE LRP toolbox
References
- #PAPER Visualization of neural networks using saliency maps (Mørch 1995)
- #PAPER Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps (Simonyan 2014)
- Presented two visualisation techniques for deep classification ConvNets
- The first generates an artificial image that is representative of a class of interest (see the sketch below)
- The second computes an image-specific class saliency map, highlighting the areas of the given image that are discriminative with respect to the given class
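- A minimal sketch of the first technique (class model visualisation by gradient ascent on the class score), assuming a Keras classifier `model` that outputs unnormalised class scores; the hyperparameters are illustrative:

```python
import tensorflow as tf

def class_model_visualisation(model, target_class, shape=(224, 224, 3),
                              steps=200, lr=1.0, l2_reg=0.01):
    """Gradient ascent on the unnormalised class score with L2 regularisation
    on the image, starting from small random noise."""
    img = tf.Variable(tf.random.normal((1, *shape), stddev=0.01))
    for _ in range(steps):
        with tf.GradientTape() as tape:
            objective = (model(img)[0, target_class]
                         - l2_reg * tf.reduce_sum(img ** 2))
        img.assign_add(lr * tape.gradient(objective, img))  # ascend the score
    return img[0]
```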
- #PAPER Understanding Neural Networks Through Deep Visualization (Yosinski 2015)
- #PAPER SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability (Raghu 2017)
- #PAPER Axiomatic Attribution for Deep Networks (Sundararajan 2017)
- #PAPER SmoothGrad: removing noise by adding noise (Smilkov 2017)
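- A minimal SmoothGrad sketch (average the input gradients over noisy copies of the input), assuming a Keras classifier `model`; names and hyperparameters are illustrative:

```python
import tensorflow as tf

def smoothgrad(model, x, target_class, n_samples=50, noise_level=0.15):
    """Average the input gradients over noisy copies of the input; the noise
    scale is a fraction of the input's dynamic range, as in the paper."""
    x = tf.convert_to_tensor(x, tf.float32)      # single image, (H, W, C)
    sigma = noise_level * (tf.reduce_max(x) - tf.reduce_min(x))
    noisy = x[None, ...] + tf.random.normal((n_samples, *x.shape), stddev=sigma)
    with tf.GradientTape() as tape:
        tape.watch(noisy)
        scores = model(noisy)[:, target_class]
    return tf.reduce_mean(tape.gradient(scores, noisy), axis=0)
```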
- #PAPER iNNvestigate Neural Networks! (Alber 2018)
- #PAPER XRAI: Better Attributions Through Regions (Kapishnikov 2019)
- #PAPER DeepLIFT - Learning Important Features Through Propagating Activation Differences (Shrikumar 2019)
- #PAPER Saliency Prediction in the Deep Learning Era: Successes, Limitations, and Future Challenges (Borji 2019)
- #PAPER DAX: Deep Argumentative eXplanation for Neural Networks (Albini 2020)
- #PAPER Interpreting Deep Neural Networks Through Variable Importance (Ish-Horowicz 2020)
- Their strategy is specifically designed to leverage partial covariance structures and incorporate variable interactions into the proposed feature ranking
- Extended the recently proposed “RelATive cEntrality” (RATE) measure (Crawford et al., 2019) to the Bayesian deep learning setting
- Given a trained network, RATE applies an information theoretic criterion to the posterior distribution of effect sizes to assess feature significance
- #PAPER Determining the Relevance of Features for Deep Neural Networks (Reimers 2020)
- Their approach builds upon concepts from causal inference
- They interpret the machine learning model within a structural causal model and use Reichenbach's common cause principle to infer whether a feature is relevant
- #PAPER Explainable Deep Learning Models in Medical Image Analysis (Singh 2020)
- #PAPER Efficient Saliency Maps for Explainable AI (Mundhenk 2020)
- #PAPER Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications (Samek 2021)
- #PAPER Logic Explained Networks (Ciravegna 2021)
- #PAPER Toward Explainable AI for Regression Models (Letzgus 2021)
- #PAPER Explaining in Style: Training a GAN to explain a classifier in StyleSpace (Lang 2021)
- #PAPER Variable selection with false discovery rate control in deep neural networks (Song 2021)
Layer-wise Relevance Propagation (LRP)
- #PAPER On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation (Bach 2015)
- #PAPER Understanding Individual Decisions of CNNs via Contrastive Backpropagation (Gu 2019)
- #PAPER Beyond saliency: understanding convolutional neural networks from saliency prediction on layer-wise relevance propagation (Li 2019)
- Proposed a novel two-step understanding method, the Salient Relevance (SR) map, which aims to shed light on how deep CNNs recognize images and learn features from attention areas
- The method starts with a layer-wise relevance propagation (LRP) step that estimates a pixel-wise relevance map over the input image; a context-aware saliency map, the SR map, is then constructed from the LRP-generated map, highlighting areas close to the foci of attention instead of the isolated pixels that LRP reveals
- #PAPER Towards Best Practice in Explaining Neural Network Decisions with LRP (Kohlbrenner 2020)